Ming-Yu Liu

Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

May 05, 2025
Via arXiv

Dynamic Camera Poses and Where to Find Them

Apr 24, 2025
Via arXiv

Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light

Apr 23, 2025
Via arXiv

Describe Anything: Detailed Localized Image and Video Captioning

Apr 22, 2025
Via arXiv

Articulated Kinematics Distillation from Video Diffusion Models

Apr 01, 2025
Via arXiv

CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models

Mar 27, 2025
Via arXiv

Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Mar 18, 2025
Via arXiv

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

Mar 18, 2025
Via arXiv

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Mar 03, 2025
Via arXiv

Cosmos World Foundation Model Platform for Physical AI

Jan 07, 2025
Via arXiv